
    Depth-first search embedded wavelet algorithm for hardware implementation

    The emerging technology of image communication over wireless transmission channels poses several new challenges that must be met simultaneously at the algorithm and architecture levels. At the algorithm level, desirable features include high coding performance, bit stream scalability, robustness to transmission errors and suitability for content-based coding schemes. At the architecture level, efficient architectures are required for the construction of portable devices with small size and low power consumption. An important question is whether a single coding algorithm can be designed to meet these diverse requirements. Recently, researchers working on improving different features have converged on a set of coding schemes commonly known as embedded wavelet algorithms. Currently, these algorithms enjoy the highest coding performances reported in the literature. In addition, embedded wavelet algorithms have the natural feature of being able to meet a target bit rate precisely. Furthermore, work on improving the algorithms' robustness has shown much promise. The potential of embedded wavelet techniques has been acknowledged by their inclusion in the new JPEG2000 and MPEG-4 image and video coding standards.
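The claim that an embedded bit stream can meet a target bit rate precisely can be illustrated with a much simplified sketch. It assumes plain bit-plane coding of integer coefficient magnitudes, which is not the algorithm of the work above and omits the significance trees used by real embedded wavelet coders. The function names `encode_bitplanes` and `decode_bitplanes` are hypothetical.

```python
def encode_bitplanes(coeffs, num_planes):
    """Emit magnitude bits plane by plane, most significant plane first.

    Signs are assumed to be sent separately (a simplification).
    """
    bits = []
    for plane in range(num_planes - 1, -1, -1):
        for c in coeffs:
            bits.append((abs(c) >> plane) & 1)
    return bits


def decode_bitplanes(bits, signs, count, num_planes):
    """Rebuild coefficients from however many bits were received."""
    mags = [0] * count
    for i, b in enumerate(bits):
        plane = num_planes - 1 - i // count  # which plane bit i belongs to
        mags[i % count] |= b << plane
    return [s * m for s, m in zip(signs, mags)]


coeffs = [13, -6, 3, 0]
signs = [1 if c >= 0 else -1 for c in coeffs]
stream = encode_bitplanes(coeffs, num_planes=4)

# Cutting the stream at any bit budget still decodes, only more coarsely.
full = decode_bitplanes(stream, signs, len(coeffs), 4)
coarse = decode_bitplanes(stream[:8], signs, len(coeffs), 4)
```

Because the most significant information comes first, the encoder can stop at exactly the bit budget it is given and the decoder still obtains a usable approximation.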

    Time-frequency shift-tolerance and counterpropagation network with applications to phoneme recognition

    Human speech signals are inherently multi-component non-stationary signals. Recognition schemes for classification of non-stationary signals generally require some kind of temporal alignment to be performed. Examples of techniques used for temporal alignment include hidden Markov models and dynamic time warping. Attempts to incorporate temporal alignment into artificial neural networks have resulted in the construction of time-delay neural networks. The nonstationary nature of speech requires a signal representation that is dependent on time. Time-frequency signal analysis is an extension of conventional time-domain and frequency-domain analysis methods. Researchers have reported on the effectiveness of time-frequency representations to reveal the time-varying nature of speech. In this thesis, a recognition scheme is developed for temporal-spectral alignment of nonstationary signals by performing preprocessing on the time-frequency distributions of the speech phonemes. The resulting representation is independent of any amount of time-frequency shift and is time-frequency shift-tolerant (TFST). The proposed scheme does not require time alignment of the signals and has the additional merit of providing spectral alignment, which may have importance in recognition of speech from different speakers. A modification to the counterpropagation network is proposed that is suitable for phoneme recognition. The modified network maintains the simplicity and competitive mechanism of the counterpropagation network and has additional benefits of fast learning and good modelling accuracy. The temporal-spectral alignment recognition scheme and modified counterpropagation network are applied to the recognition task of speech phonemes. Simulations show that the proposed scheme has potential in the classification of speech phonemes which have not been aligned in time. 
To facilitate the research, an environment for time-frequency signal analysis and recognition using artificial neural networks was developed. The environment provides tools for time-frequency signal analysis and for simulations of the counterpropagation network.
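For context, dynamic time warping, one of the conventional temporal alignment techniques named in the abstract, can be sketched as follows. This is the textbook baseline, not the thesis's TFST scheme, and it assumes one-dimensional sequences compared with an absolute-difference cost; the function name `dtw_distance` is chosen here for illustration.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a[i-1] repeats a match with b[j-1]
                cost[i][j - 1],      # b[j-1] repeats a match with a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# The second sequence is the first with its opening sample held one step longer.
reference = [0, 1, 2, 1, 0]
delayed = [0, 0, 1, 2, 1, 0]
distance = dtw_distance(reference, delayed)
```

The warping absorbs the delay, so the two sequences are judged identical even though they differ sample by sample.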

    Multi-scale Discriminant Saliency with Wavelet-based Hidden Markov Tree Modelling

    Bottom-up saliency, an early stage of human visual attention, can be considered as a binary classification problem between centre and surround classes. The discriminant power of features for the classification is measured as the mutual information between the distributions of image features and the corresponding classes. As the estimated discrepancy depends strongly on the scale level considered, multi-scale structure and discriminant power are integrated by employing discrete wavelet features and a Hidden Markov Tree (HMT). With wavelet coefficients and Hidden Markov Tree parameters, quad-tree-like label structures are constructed and used in maximum a posteriori (MAP) estimation of the hidden class variables at the corresponding dyadic sub-squares. Then, a saliency value for each square block at each scale level is computed with the discriminant power principle. Finally, the saliency values across multiple scales are integrated into the final saliency map by an information maximization rule. Both standard quantitative tools such as NSS, LCC and AUC and qualitative assessments are used to evaluate the proposed multi-scale discriminant saliency (MDIS) method against the well-known information-based approach AIM on its released image collection with eye-tracking data. Simulation results are presented and analysed to verify the validity of MDIS and to point out its limitations as directions for further research. Comment: arXiv admin note: substantial text overlap with arXiv:1301.396
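The discriminant-power measure described above, mutual information between a feature and the centre/surround class, can be sketched for the simplest case. This assumes already-quantised (discrete) feature values and empirical frequency estimates; it does not model the wavelet or HMT stages of the paper, and the function name `mutual_information` is hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(features, labels):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(features)
    joint = Counter(zip(features, labels))
    feature_counts = Counter(features)
    label_counts = Counter(labels)
    mi = 0.0
    for (f, c), count in joint.items():
        p_fc = count / n
        p_f = feature_counts[f] / n
        p_c = label_counts[c] / n
        mi += p_fc * log2(p_fc / (p_f * p_c))
    return mi


labels = ["centre", "centre", "surround", "surround"]
# A feature that separates the classes perfectly, and one that carries no class information.
discriminant = mutual_information([0, 0, 1, 1], labels)
uninformative = mutual_information([0, 1, 0, 1], labels)
```

A feature whose values differ systematically between centre and surround scores high, which is the sense in which a location is treated as salient.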

    A new approach in solving illumination and facial expression problems for face recognition

    In this paper, a novel dual optimal multiband features (DOMF) method is presented to increase the robustness of face recognition systems to illumination and facial expression variations. The wavelet packet transform first decomposes the image into low-, mid- and high-frequency subbands, and a multiband feature fusion technique is used to select, separately, the subbands that are invariant to illumination variation and those that are invariant to expression variation. These subbands form the optimal feature sets. Parallel radial basis function neural networks are employed to classify these feature sets. The scores generated by the neural networks are combined by an adaptive fusion mechanism in which the level of illumination variation in the test image is estimated and the weights are assigned to the scores accordingly. The experimental results show that DOMF outperforms other algorithms and also achieves promising performance under illumination and facial expression variation.
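The adaptive fusion step can be pictured with a small sketch. The abstract does not give the weighting rule, so the linear rule below, the assumption that the illumination level is already estimated as a number in [0, 1], and the name `fuse_scores` are all hypothetical choices made for illustration only.

```python
def fuse_scores(illum_scores, expr_scores, illumination_level):
    """Blend per-class scores from two classifiers.

    illumination_level is assumed to lie in [0, 1]; a higher value shifts
    weight toward the classifier fed with illumination-invariant subbands.
    """
    if not 0.0 <= illumination_level <= 1.0:
        raise ValueError("illumination_level must be in [0, 1]")
    w = illumination_level
    return [w * a + (1.0 - w) * b for a, b in zip(illum_scores, expr_scores)]


# Two candidate identities scored by each network (made-up numbers).
fused = fuse_scores([0.2, 0.8], [0.6, 0.4], illumination_level=0.75)
predicted = max(range(len(fused)), key=fused.__getitem__)
```

The point of the design is that neither classifier is trusted uniformly: the test image itself decides how much each one counts.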

    Non-uniform face mesh for 3D face recognition

    Uniform face meshes can represent the face in 3D format and can also be used to perform 3D face recognition. However, to obtain a good recognition rate, a fine mesh consisting of many points is needed to represent the many contours of the face accurately. Therefore, this paper proposes constructing a non-uniform face mesh for 3D face recognition. A non-uniform mesh consisting of fine meshes for the middle of the face and coarse meshes for the rest of the face was created. In comparison with a uniform mesh, the proposed non-uniform face mesh consists of far fewer points and therefore has a smaller file size, saving storage space and transmission time. In addition, the proposed mesh produced recognition rates that were only slightly lower than those of the uniform mesh, indicating that the face features important for recognition were retained.
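The saving in point count can be illustrated on a flat grid. This sketch assumes a square sampling lattice and a rectangular central region, neither of which is stated in the abstract, and the function names are hypothetical; a real face mesh would sample a 3D surface.

```python
def grid_points(width, height, step):
    """Lattice points of a width x height area sampled every `step` units."""
    return [
        (x, y)
        for y in range(0, height + 1, step)
        for x in range(0, width + 1, step)
    ]


def non_uniform_points(width, height, fine, coarse, region):
    """Coarse sampling everywhere, fine sampling inside `region` only.

    region is (x0, y0, x1, y1), inclusive on all sides.
    """
    x0, y0, x1, y1 = region
    points = set(grid_points(width, height, coarse))
    points.update(
        (x, y)
        for (x, y) in grid_points(width, height, fine)
        if x0 <= x <= x1 and y0 <= y <= y1
    )
    return sorted(points)


uniform = grid_points(8, 8, step=1)
mixed = non_uniform_points(8, 8, fine=1, coarse=4, region=(2, 2, 6, 6))
print(len(uniform), len(mixed))
```

The central region keeps full resolution while the periphery contributes only a handful of points, which is where the reduction in file size comes from.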

    Salient region detection using contrast-based saliency and watershed segmentation

    Salient region detection is useful for many applications such as image segmentation, compression, image retrieval, object tracking, and machine vision systems. In this paper, an approach to detecting salient regions in a visual scene using contrast-based saliency and watershed segmentation is presented. The approach allows salient objects to be detected and extracted for analysis while preserving the actual boundaries of the salient objects. The approach can be executed in parallel, making it efficient for real-time applications.